Using Personality Recognition Techniques to Improve Bayesian Spam Filtering

Authors

Enaitz Ezpeleta

Urko Zurutuza

José María Gómez Hidalgo

Abstract

Millions of users are affected by unsolicited email campaigns every day. In recent years several spam detection techniques have been developed, with machine learning algorithms achieving especially good results. In this work we provide a baseline for a new spam filtering method and validate our hypothesis that personality recognition techniques can help in Bayesian spam filtering. We add a personality feature to each email using personality recognition techniques, and then compare Bayesian spam filters with and without this feature in terms of accuracy. In a second experiment we combine the personality and polarity features of each message and compare all the results. In the end, the top ten Bayesian filtering classifiers are improved, reaching an accuracy of 99.24% while also reducing the number of false positives.
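The with/without comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `toy_personality` function is a hypothetical stand-in for a real personality recognizer, the classifier is a small hand-written multinomial Naive Bayes rather than any of the filters evaluated in the paper, and the four training messages are invented. It shows only the mechanics of appending a personality label as one extra feature, not the authors' pipeline or results.

```python
import math
from collections import Counter, defaultdict


def toy_personality(text):
    """Hypothetical placeholder for a personality recognizer.

    It maps a surface cue (exclamation marks) to a coarse label.
    This is NOT the technique used in the paper.
    """
    if text.count("!") >= 2:
        return "PERS_high_extraversion"
    return "PERS_low_extraversion"


def features(text, use_personality):
    """Bag-of-words tokens, optionally extended with one personality token."""
    tokens = text.lower().replace("!", " ").split()
    if use_personality:
        tokens.append(toy_personality(text))
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = (sum(self.token_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                if tok in self.vocab:  # ignore tokens unseen in training
                    count = self.token_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(train, test, use_personality):
    """Train on `train`, return the fraction of `test` classified correctly."""
    docs = [features(text, use_personality) for text, _ in train]
    labels = [label for _, label in train]
    model = NaiveBayes().fit(docs, labels)
    hits = sum(model.predict(features(text, use_personality)) == label
               for text, label in test)
    return hits / len(test)


# Invented toy messages, for illustration only.
TRAIN = [
    ("Win money now!!! Click here", "spam"),
    ("Cheap pills!! Buy now", "spam"),
    ("Meeting moved to Monday", "ham"),
    ("Please review the attached report", "ham"),
]
TEST = [
    ("Buy cheap money now!!", "spam"),
    ("Report for Monday meeting", "ham"),
]

if __name__ == "__main__":
    print("without personality:", accuracy(TRAIN, TEST, False))
    print("with personality:   ", accuracy(TRAIN, TEST, True))
```

On a corpus this small the two settings are unlikely to differ; the point is only that the personality label enters the filter as one more token, so the same classifier can be run both ways and its accuracy compared.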

Similar resources

Detecting Image Spam Using Image Texture Features

Filtering image email spam is considered a challenging problem because spammers keep modifying the images used in their campaigns by employing different obfuscation techniques, thereby preventing text recognition by Optical Character Recognition (OCR) tools and imposing additional challenges in filtering this type of spam. In this paper, we propose an image spam filtering tech...

Image spam filtering using textual and visual information

In this paper we focus on the so-called image spam, which consists in embedding the spam message into images attached to e-mails to circumvent statistical techniques based on the analysis of body text of e-mails (like the “bayesian filters”), and in applying content obscuring techniques to such images to make them unreadable by standard OCR systems without compromising human readability. We arg...

Introduction of Fingerprint Vector based Bayesian Method for Spam Filtering

The growing diversification of spam raises difficulties and challenges for content-based spam filtering. To address this problem, this paper first introduces the statistical features of email headers and then proposes a method that uses these features to improve Bayesian anti-spam filters. The selected email-header features are presented as fingerprint vectors, and the...

A Fuzzy Clustering Approach to Filter Spam E-Mail

Spam email is the practice of frequently sending unwanted email messages, usually with commercial content, in large quantities to an indiscriminate set of email accounts. Since spammers continuously improve their techniques in order to evade spam filters, building a spam filter that can be incrementally trained and adapted has become an active research field. Researches employed m...

BUPT at TREC 2006: Spam Track

This report summarizes our participation in the TREC 2006 spam track, in which we consider the use of Bayesian models for the spam filtering task. First, our anti-spam filter, Kidult, is briefly introduced. We then try a weighted adjustment of the separating hyperplane and a selective classifier ensemble to improve the filtering performance. Finally, we summarize the relevant results from...

Journal title:

Procesamiento del Lenguaje Natural

Volume 57

Publication date: 2016
